Abstract

Let $(X_t)_{0 \le t \le T}$ be a one-dimensional stochastic process with independent
and stationary increments, either in discrete or continuous time. This paper
considers the problem of stopping the process $(X_t)$ "as close as possible" to
its eventual supremum $M_T := \sup_{0 \le t \le T} X_t$, when the reward for stopping
at time $\tau \le T$ is a nonincreasing convex function of $M_T - X_\tau$. Under
fairly general conditions on the process $(X_t)$, it is shown that the optimal
stopping time $\tau$ takes a trivial form: it is either optimal to stop at time 0
or at time $T$. For the case of random walk, the rule $\tau \equiv T$ is optimal if the
steps of the walk stochastically dominate their opposites, and the rule $\tau \equiv 0$
is optimal if the reverse relationship holds. An analogous result is proved
for Lévy processes with finite Lévy measure. The result is then extended
to some processes with nonfinite Lévy measure, including stable processes
and processes whose jump component is of finite variation.
1 Introduction



In recent years there has been a great deal of interest in optimal prediction
problems of the form

    $\sup_{\tau \le T} E[f(M_T - X_\tau)]$,    (1.1)

where $f$ is a nonincreasing function, $(X_t)_{t \ge 0}$ a one-dimensional stochastic
process, $T > 0$ a finite time horizon, and $M_T := \sup\{X_t : 0 \le t \le T\}$. The
supremum in (1.1) is taken over the set of all stopping times $\tau$ adapted to the
process $(X_t)_{t \ge 0}$ for which $P(\tau \le T) = 1$. For the case of Brownian motion,
the problem (1.1) has been investigated for several reward functions $f$, though
it is often formulated as a penalty-minimization problem in the form

    $\inf_{\tau \le T} E[\tilde f(M_T - X_\tau)]$,    (1.2)

where $\tilde f := -f$. For instance, Graversen et al. [7] solved (1.2) for standard
Brownian motion and $\tilde f(x) = x^2$. Their result was generalized to
$\tilde f(x) = x^\alpha$ for arbitrary $\alpha > 0$ by Pedersen [9], who also considered the
function $f = \chi_{[0,\varepsilon]}$ for $\varepsilon > 0$ in (1.1). Du Toit and Peskir [5] were the
first to extend these results (for power functions $\tilde f$) to Brownian motion with
arbitrary drift, which required an entirely new approach. More recently,
Shiryaev et al. [11] considered the problem (1.1) for Brownian motion with drift
and $f(x) = e^{-ax}$, where $a > 0$. In that case the problem has the natural
interpretation of maximizing the expected ratio of the selling price to the
eventual maximum price in the Black-Scholes model for stock price movements.
They observed that when the drift parameter lies outside a certain critical
interval, the optimal rule $\tau^*$ becomes trivial; that is, either $\tau^* \equiv 0$ or
$\tau^* \equiv T$. A year later, Du Toit and Peskir [6] managed to prove that the
optimal rule is trivial also in the critical interval. More precisely, their
result was that $\tau^* \equiv 0$ when the drift is negative, and $\tau^* \equiv T$ when the
drift is positive. While this may seem intuitively quite plausible, it is
nontrivial to prove. Since the optimal rule changes abruptly from 0 to $T$ as the
drift parameter passes through 0, Du Toit and Peskir [6] called it a "bang-bang"
stopping rule. They also showed that for the (seemingly quite similar) problem
(1.2) with $\tilde f(x) = e^{ax}$, the optimal rule is not of bang-bang form, but
transitions from $\tau^* \equiv 0$ to $\tau^* \equiv T$ in a nontrivial way throughout the
critical interval.

In the discrete-time setting, an analogous result for Bernoulli random walk
was obtained later the same year by Yam et al. [12], using ideas from [6]. Here
we put $T = N$, a positive integer, and write $X_n$ instead of $X_t$, where
$\{X_n\}_{0 \le n \le N}$ is a simple random walk with parameter $p$. Yam et al. [12]
considered both the function $f = \chi_0$, the characteristic function of the set
$\{0\}$ (in which case the expectation in (1.1) is just the probability of stopping
at the "top" of the random walk) and the function $f(x) = e^{-ax}$, and concluded
that in both cases, the optimal rule is of bang-bang type. Precisely, the
optimal rule is $\tau \equiv N$ when $p > 1/2$; $\tau \equiv 0$ when $p < 1/2$; or any stopping
rule $\tau$ satisfying $P(X_\tau = M_\tau \text{ or } \tau = N) = 1$ when $p = 1/2$. It is worth
noting that the case $f = \chi_0$ had already been considered for general symmetric
random walks more than 20 years earlier by Hlynka and Sheahan [8].

The results for both discrete and continuous time were recently extended in
Allaart [1], where it is shown that the bang-bang principle holds for both
Bernoulli random walk and Brownian motion with drift whenever $f$ is
nonincreasing and convex. Equivalently, it holds for problem (1.2) when
$\tilde f$ is nondecreasing and concave, which is the case, for instance, for the
natural penalty function $\tilde f(x) = x^\alpha$ with $0 < \alpha \le 1$. Allaart [1] gives
simple sufficient conditions on $f$ for the optimal rules to be unique in the
discrete-time case, and necessary and sufficient conditions for the case of
Brownian motion.

The aim of the present paper is to extend the result further still, to include
more general random walks as well as certain Lévy processes. First, in Section
2 it is shown that the bang-bang principle holds for any random walk whose
increments stochastically dominate their opposites, or vice versa (see Theorem
2.1 below). In Section 3 an analogous result is proved for Lévy processes, first
for the case of finite Lévy measure (Theorem 3.2), then for the more general case
(Theorem 3.9). This appears to require some notion of drift, and therefore it
seems necessary to impose some additional conditions pertaining to the "small
jumps" of the process. One of these conditions can be omitted in the case when
$f$ is continuous and bounded (Theorem 3.12), but the author does not know whether
it is needed in the general case. The extra conditions may seem restrictive, but
they are satisfied by several commonly studied types of Lévy processes including
subordinators and symmetric stable processes.

A possible application of this research is in finance. Suppose you buy a share 
of stock on the first day of the month, which you must sell some time by the end 
of the month. Perhaps the stock price follows a random walk in discrete time, and 
your objective is to maximize the probability of selling the stock at the highest 
price over the month. In that case, let $X_t$ be the random walk, and let
$f = \chi_0$. Or perhaps the stock price follows an exponentiated Lévy process,
such as geometric Brownian motion, and your goal is to maximize the expected
ratio of the price at the time you sell to the eventual maximum price. In that
case, let $X_t$ be the Lévy process, and put $f(x) = e^{-ax}$, where $a > 0$. In both
examples the results of this paper imply, under suitable conditions on the
process $X_t$, that it is either optimal to sell the stock immediately, or to keep
it until the last day of the month. In fact, the result for the second example
remains valid if one takes as objective function an arbitrary increasing convex
function $g$ of the price ratio, since if $g : (0, \infty) \to \mathbb{R}$ is increasing and
convex, then $f(x) = g(e^{-ax})$ is decreasing and convex.

While this paper was in preparation, the author learnt that D. Orlov has also
extended the bang-bang principle to certain Lévy processes. Unfortunately, an
English version of his paper was not available at the time the present article
was nearing completion. In addition, a paper by Bernyk et al. [4] was posted
on the arXiv in which problem (1.2) is solved for stable Lévy processes of index
$\alpha \in (1, 2)$ with no negative jumps, for the penalty function $\tilde f(x) = x^p$
with $p > 1$. (We observe that for this case, $f = -\tilde f$ is not convex, so the
results of the present note do not apply; indeed, the optimal rule is nontrivial
and its determination requires significant analytical tools.) Some of the
preparatory work for this last paper was done in [3].

2 The maximum of a random walk 

In this section, let $\{X_n\}_{n=0,1,\dots}$ be a random walk with general steps
satisfying a form of skew-symmetry as follows: $X_0 = 0$, and for $n \ge 1$,
$X_n = \sum_{k=1}^{n} \xi_k$, where $\xi_1, \xi_2, \dots$ are independent, identically
distributed (i.i.d.) random variables for which either $\xi_1 \ge_{st} -\xi_1$ or
$\xi_1 \le_{st} -\xi_1$. Here, $\ge_{st}$ denotes the usual stochastic order of random
variables. Let $M_n := \max_{0 \le k \le n} X_k$ for $n = 0, 1, \dots, N$, where
$N \in \mathbb{N}$ is a finite time horizon. For a nonincreasing function
$f : [0, \infty) \to \mathbb{R}$, consider the optimal stopping problem

    $\sup_{0 \le \tau \le N} E[f(M_N - X_\tau)]$,    (2.1)

where the supremum is over the set of all stopping times $\tau \le N$ adapted to the
natural filtration $\{\mathcal{F}_n\}_{0 \le n \le N}$ of the process $\{X_n\}_{0 \le n \le N}$. We note
that since $f$ is bounded above, the expectation in (2.1) always exists, though
it could take the value $-\infty$.

The above setup includes Bernoulli random walk with arbitrary parameter
$p \in (0, 1)$ as a special case, but is of course much more general.

Theorem 2.1. Assume that either $\xi_1 \ge_{st} -\xi_1$ or $\xi_1 \le_{st} -\xi_1$, and let
$f : [0, \infty) \to \mathbb{R}$ be nonincreasing and convex. Consider the problem (2.1).

(i) If $\xi_1 \ge_{st} -\xi_1$, the rule $\tau \equiv N$ is optimal.

(ii) If $\xi_1 \le_{st} -\xi_1$, the rule $\tau \equiv 0$ is optimal.

(iii) If $\xi_1 \stackrel{d}{=} -\xi_1$, any rule $\tau$ satisfying
$P(X_\tau = M_\tau \text{ or } \tau = N) = 1$ is optimal.
Remark 2.2. By the assumption of convexity $f$ must be continuous on $(0, \infty)$,
but it may have a jump discontinuity at $x = 0$. Thus, in particular, the theorem
covers the important case $f = \chi_0$, the characteristic function of the set $\{0\}$.
In that case, the problem comes down to maximizing the probability of stopping at
the highest point of the walk, so it can be thought of as a random walk version of
the secretary (or best-choice) problem.

Example 2.3. The condition $\xi_1 \ge_{st} -\xi_1$ in statement (i) cannot be replaced
by the condition $E(\xi_1) > 0$. For instance, let
$P(\xi_1 = 3) = 1/3 = 1 - P(\xi_1 = -1)$, let $f = \chi_0$, and take $N = 2$. Even though
$E(\xi_1) = 1/3 > 0$, the optimal rule is easily seen to be $\tau \equiv 0$ rather than
$\tau \equiv 2$.
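The claim in Example 2.3 can be checked by brute force. The following is a minimal sketch, not part of the original argument: it enumerates the four paths of the two-step walk with exact rational arithmetic, evaluates the fixed rules, and finds the optimal value by backward induction over step histories. The function names are illustrative choices.

```python
from fractions import Fraction
from itertools import accumulate, product
from math import prod

# Step law of Example 2.3: P(xi = 3) = 1/3, P(xi = -1) = 2/3, horizon N = 2.
STEPS = {3: Fraction(1, 3), -1: Fraction(2, 3)}
N = 2

def win_prob_fixed(n):
    """P(X_n = M_N) for the deterministic rule tau = n."""
    total = Fraction(0)
    for path in product(STEPS, repeat=N):
        xs = list(accumulate(path, initial=0))
        if xs[n] == max(xs):
            total += prod(STEPS[s] for s in path)
    return total

def win_if_stop(prefix):
    """P(X_n = M_N | first n steps = prefix) when stopping at n = len(prefix)."""
    xs = list(accumulate(prefix, initial=0))
    if xs[-1] < max(xs):
        return Fraction(0)  # already below the running maximum
    total = Fraction(0)
    for rest in product(STEPS, repeat=N - len(prefix)):
        # Win only if the walk never rises above its current level again.
        if max(accumulate(rest, initial=0)) == 0:
            total += prod(STEPS[s] for s in rest)
    return total

def value(prefix=()):
    """Optimal winning probability given the steps observed so far."""
    stop = win_if_stop(prefix)
    if len(prefix) == N:
        return stop
    cont = sum(STEPS[s] * value(prefix + (s,)) for s in STEPS)
    return max(stop, cont)

if __name__ == "__main__":
    print("tau = 0:", win_prob_fixed(0))
    print("tau = 2:", win_prob_fixed(2))
    print("optimal:", value())
```

Under these assumptions the rule stopping at time 0 attains the optimal value, while waiting until time 2 does strictly worse, in line with the example.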

In the case of Bernoulli random walk, simple sufficient conditions on the function
$f$ under which the optimal rules given above are unique are given in [1]. There
an example is also given to show that without convexity of $f$, the conclusion of
Theorem 2.1 may fail in general.

The proof of Theorem 2.1 uses the following generalization of Lemma 2.1 in [1].



Lemma 2.4. Let $f$ be as in Theorem 2.1, and suppose $\xi_1 \ge_{st} -\xi_1$. Then

    $E[f(z \vee M_n - X_n)] \ge E[f(z \vee (M_n - X_n))]$    (2.2)

for all $n \le N$ and all $z \ge 0$.

Here and below, an expression such as $z \vee m - x$ is to be read as $(z \vee m) - x$.

Since the statement of the lemma involves only expectations, we may construct
the random walk on a convenient probability space. Recall first that if
$X \ge_{st} Y$, then $X$ and $Y$ can be defined on a common probability space
$(\Omega, \mathcal{F}, P)$ so that $X(\omega) \ge Y(\omega)$ for all $\omega \in \Omega$. Thus, on a sufficiently
large probability space, we can construct the random variables $\xi_1, \dots, \xi_N$
together with another set of random variables $\tilde\xi_1, \dots, \tilde\xi_N$ such that the
random vectors $(\xi_1, \tilde\xi_1), \dots, (\xi_N, \tilde\xi_N)$ are independent,
$\tilde\xi_i \stackrel{d}{=} -\xi_i$ for each $i$, and $\xi_i \ge \tilde\xi_i$ for each $i$. Let $\tilde X_0 = 0$,
and $\tilde X_n = \sum_{k=1}^{n} \tilde\xi_k$ for $n = 1, 2, \dots, N$. Finally, define
$\tilde M_n := \max_{0 \le k \le n} \tilde X_k$, $n = 0, 1, \dots, N$. Clearly, $X_n \ge \tilde X_n$ and
$M_n \ge \tilde M_n$ for every $n$.

It is also useful to define

    $Z_n := M_n - X_n$ and $\tilde Z_n := \tilde M_n - \tilde X_n$,    $n = 0, 1, \dots, N$.

One checks easily that

    $Z_n \le \tilde Z_n$,    $n = 0, 1, \dots, N$.    (2.3)
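The coupling just described can be made concrete with a quantile construction. The sketch below is only an illustration under an assumed step law (a Bernoulli walk with parameter 0.7, so that the steps dominate their opposites); the names `P_UP`, `coupled_step` and `domination_holds` are hypothetical. Both steps are built as nondecreasing functions of one uniform variable, which gives pointwise domination, and the pathwise relations between the walk and its dual are then checked on simulated paths.

```python
import random
from itertools import accumulate

P_UP = 0.7  # assumed Bernoulli parameter; p > 1/2 gives xi >=_st -xi

def coupled_step(u):
    """Build (xi, xi_tilde) from one uniform u in [0, 1).

    xi is +1 with probability P_UP; xi_tilde is +1 with probability 1 - P_UP,
    i.e. it has the law of -xi. Both are nondecreasing in u, so xi >= xi_tilde.
    """
    xi = 1 if u > 1 - P_UP else -1
    xi_tilde = 1 if u > P_UP else -1
    return xi, xi_tilde

def coupled_paths(n_steps, rng):
    """Return the walk X and its dual X_tilde driven by the same uniforms."""
    pairs = [coupled_step(rng.random()) for _ in range(n_steps)]
    xs = list(accumulate((a for a, _ in pairs), initial=0))
    xs_tilde = list(accumulate((b for _, b in pairs), initial=0))
    return xs, xs_tilde

def domination_holds(xs, xs_tilde):
    """Check X_n >= X~_n, M_n >= M~_n and Z_n <= Z~_n for every n."""
    ms = list(accumulate(xs, max))
    ms_tilde = list(accumulate(xs_tilde, max))
    return all(
        x >= xt and m >= mt and (m - x) <= (mt - xt)
        for x, xt, m, mt in zip(xs, xs_tilde, ms, ms_tilde)
    )

if __name__ == "__main__":
    rng = random.Random(0)
    print(all(domination_holds(*coupled_paths(50, rng)) for _ in range(1000)))
```

The check never fails because the domination holds for every outcome, not merely with high probability.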
The key to the proof of the lemma is that, for each fixed $n$,

    $(M_n - X_n, X_n) \stackrel{d}{=} (\tilde M_n, -\tilde X_n)$,    (2.4)

as follows from an easy time-reversal argument.
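For small horizons the time-reversal identity can be confirmed by exact enumeration. The following sketch assumes an arbitrary illustrative step law (the identity itself needs no skew-symmetry) and compares the joint law of $(M_n - X_n, X_n)$ with that of $(\tilde M_n, -\tilde X_n)$, where the dual walk uses the negated steps. The helper name `joint_law` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import accumulate, product

# Illustrative step law; any finitely supported law would do.
STEPS = {3: Fraction(1, 3), -1: Fraction(2, 3)}
DUAL_STEPS = {-s: p for s, p in STEPS.items()}  # law of -xi

def joint_law(steps, n, statistic):
    """Exact law of statistic(path) over all n-step paths with i.i.d. steps."""
    law = defaultdict(Fraction)
    for path in product(steps, repeat=n):
        prob = Fraction(1)
        for s in path:
            prob *= steps[s]
        xs = list(accumulate(path, initial=0))
        law[statistic(xs)] += prob
    return dict(law)

def reversal_identity_holds(n):
    """Compare the laws of (M_n - X_n, X_n) and (M~_n, -X~_n)."""
    lhs = joint_law(STEPS, n, lambda xs: (max(xs) - xs[-1], xs[-1]))
    rhs = joint_law(DUAL_STEPS, n, lambda xs: (max(xs), -xs[-1]))
    return lhs == rhs

if __name__ == "__main__":
    print([reversal_identity_holds(n) for n in range(1, 7)])
```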



Proof of Lemma 2.4. The lemma holds trivially (with equality) when $z = 0$, so
assume $z > 0$. We must first deal separately with the case when
$E[f(z \vee M_n - X_n)] = -\infty$. Let $a := [f(z) - f(0)]/z$. Then the convexity of
$f$ implies that, for all $u \ge 0$,

    $f(u + z) - f(u) \ge az$.    (2.5)

Using the algebraic inequality $z \vee m - x \le z \vee (m - x) + z$ (valid for $z \ge 0$
and $m \ge 0$) and the fact that $f$ is nonincreasing, we get

    $f(z \vee m - x) \ge f(z \vee (m - x) + z) \ge f(z \vee (m - x)) + az$,

in view of (2.5). Thus, if $E[f(z \vee M_n - X_n)] = -\infty$, then
$E[f(z \vee (M_n - X_n))] = -\infty$ as well, and the lemma holds in this case.

Assume for the remainder of the proof that $E[f(z \vee M_n - X_n)] > -\infty$. Since
$n$ is fixed, we omit the subscripts and write $M = M_n$, $X = X_n$, $Z = Z_n$, and
similarly for their tilded counterparts. Let

    $h(z, m, x) := f(z \vee m - x) - f(z \vee (m - x))$,

so that it is to be shown that

    $E[h(z, M, X)] \ge 0$.    (2.6)

The above expectation exists and is finite, because $az \le h(z, m, x) \le |a|z$. We
begin by writing

    $E[h(z, M, X)] = E[h(z, M, X); X \ge 0] + E[h(z, M, X); X < 0]$.

Using (2.4), we can write the second expectation as

    $E[h(z, M, X); X < 0] = E[h(z, \tilde M - \tilde X, -\tilde X); \tilde X > 0]$.    (2.7)

On the other hand, we claim that

    $E[h(z, M, X); X \ge 0] \ge E[h(z, \tilde M, \tilde X); \tilde X > 0]$.    (2.8)
To see this, note that $h(z, M, X) = 0$ on $\{X \ge 0, M - X \ge z\}$, and hence,

    $h(z, M, X) 1(X \ge 0) = (f(z \vee M - X) - f(z)) 1(X \ge 0, M - X < z)$
        $= (f(\max\{z - X, Z\}) - f(z)) 1(X \ge 0, Z < z)$
        $\ge (f(\max\{z - X, Z\}) - f(z)) 1(\tilde X > 0, \tilde Z < z)$
        $\ge (f(\max\{z - \tilde X, \tilde Z\}) - f(z)) 1(\tilde X > 0, \tilde Z < z)$
        $= h(z, \tilde M, \tilde X) 1(\tilde X > 0)$.

Here the first inequality follows since
$\{\tilde X > 0, \tilde Z < z\} \subset \{X \ge 0, Z < z\}$ by (2.3) and $X \ge \tilde X$,
$\max\{z - X, Z\} \le z$ on $\{X \ge 0, Z < z\}$, and $f$ is nonincreasing. The second
inequality follows since $f$ is nonincreasing and
$\max\{z - X, Z\} \le \max\{z - \tilde X, \tilde Z\}$. Combining (2.7) and (2.8), we obtain

    $E[h(z, M, X)] \ge E[h(z, \tilde M, \tilde X) + h(z, \tilde M - \tilde X, -\tilde X); \tilde X > 0]$.    (2.9)

Next, the convexity of $f$ implies that for all $0 \le x \le y$ and all $d \ge 0$,

    $f(x) - f(x + d) \ge f(y) - f(y + d)$,    (2.10)

as is easily checked. Thus, for $z \ge 0$ and $0 \le x \le m$, we have

    $h(z, m, x) + h(z, m - x, -x) = [f(z \vee m - x) - f(z \vee (m - x))]$
            $+ [f(z \vee (m - x) + x) - f(z \vee m)]$
        $= [f(z \vee m - x) - f(z \vee m)]$
            $- [f(z \vee (m - x)) - f(z \vee (m - x) + x)]$
        $\ge 0$,

where the inequality follows by (2.10) with $d = x$, since $x \ge 0$ implies that
$z \vee m - x \le z \vee (m - x)$. This, together with (2.9), yields (2.6). □



Corollary 2.5. Under the hypotheses of Lemma 2.4,

    $E[f(z \vee M_n - X_n)] \ge E[f(z \vee M_n)]$    (2.11)

for all $n \le N$ and all $z \ge 0$.

Proof. By (2.4), the inequality (2.2) can be expressed alternatively as

    $E[f(z \vee M_n - X_n)] \ge E[f(z \vee \tilde M_n)]$.    (2.12)

But $E[f(z \vee \tilde M_n)] \ge E[f(z \vee M_n)]$, since $\tilde M_n \le_{st} M_n$ and $f$ is
nonincreasing. Thus, (2.11) follows. □
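As a sanity check on (2.2) and (2.11), both inequalities can be evaluated exactly for a small walk. The sketch below assumes an illustrative step law taking the values 2 and -1 with probability 1/2 each (which dominates its opposite in the stochastic order) and two convex nonincreasing rewards, $e^{-x}$ and the indicator of $\{0\}$. It is a numerical illustration, not a substitute for the proof.

```python
import math
from itertools import accumulate, product

# Assumed step law with xi >=_st -xi: values 2 and -1, probability 1/2 each.
STEPS = {2: 0.5, -1: 0.5}

def expectations(n, z, f):
    """Return (E f(z v M_n - X_n), E f(z v (M_n - X_n)), E f(z v M_n))."""
    lhs = mid = rhs = 0.0
    for path in product(STEPS, repeat=n):
        prob = math.prod(STEPS[s] for s in path)
        xs = list(accumulate(path, initial=0))
        m, x = max(xs), xs[-1]
        lhs += prob * f(max(z, m) - x)   # left side of (2.2) and (2.11)
        mid += prob * f(max(z, m - x))   # right side of (2.2)
        rhs += prob * f(max(z, m))       # right side of (2.11)
    return lhs, mid, rhs

def exp_reward(x):
    """Reward f(x) = exp(-x)."""
    return math.exp(-x)

def top_reward(x):
    """Reward f = indicator of {0}."""
    return 1.0 if x == 0 else 0.0

if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, expectations(n, 1.0, exp_reward))
```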
Proof of Theorem 2.1. The main idea in the proof below is essentially due to Du
Toit and Peskir [6]; see Yam et al. [12] for the discrete-time case.

(i) Suppose first that $\xi_1 \ge_{st} -\xi_1$. Construct the random variables $X_k$, $M_k$,
$Z_k$ and $\tilde\xi_k$, $\tilde X_k$, $\tilde M_k$ and $\tilde Z_k$ on a common probability space as in
the discussion following the statement of Lemma 2.4. Define the $\sigma$-algebras

    $\mathcal{G}_k := \sigma(\{\xi_1, \dots, \xi_k, \tilde\xi_1, \dots, \tilde\xi_k\})$,    $k = 0, 1, \dots, N$.

It will be important later in the proof that the increments $X_k - X_j$ and
$\tilde X_k - \tilde X_j$ are independent of $\mathcal{G}_j$, for all $0 \le j \le k$. Note further that
if the stopping time $\tau \equiv N$ is optimal among the set of all stopping times
relative to the filtration $\{\mathcal{G}_k\}$, then it is certainly optimal among the
stopping times relative to $\{\mathcal{F}_k\}$. Thus, it is sufficient to show that

    $E[f(M_N - X_\tau)] \le E[f(M_N - X_N)]$    (2.13)

for any stopping time $\tau$ relative to $\{\mathcal{G}_k\}$. Define the functions

    $G(k, z) := E[f(z \vee M_k)]$,    $D(k, z) := E[f(z \vee M_k - X_k)]$,

for $z \ge 0$ and $k = 0, 1, \dots, N$. Note that $G(k, z)$ and $D(k, z)$ can possibly take
the value $-\infty$. Let $\tau \le N$ be any stopping time. An easy exercise using the
independent and stationary increments of the random walk $\{X_k\}$ leads to

    $E[f(M_N - X_\tau) | \mathcal{G}_\tau] = G(N - \tau, Z_\tau)$,    (2.14)

and

    $E[f(M_N - X_N) | \mathcal{G}_\tau] = D(N - \tau, Z_\tau)$.    (2.15)

Now Corollary 2.5 says that $D(k, z) \ge G(k, z)$, and hence

    $E[f(M_N - X_\tau) | \mathcal{G}_\tau] \le E[f(M_N - X_N) | \mathcal{G}_\tau]$.

Taking expectations on both sides gives (2.13), as desired.
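The functions $G$ and $D$ also give a direct way to test the bang-bang conclusion numerically. The sketch below assumes a Bernoulli walk and works on the state $Z_n = M_n - X_n$, which evolves as $Z_{n+1} = \max(Z_n - \xi_{n+1}, 0)$; it computes the optimal value by backward induction and compares it with the values of the two trivial rules. The function name `bang_bang_values` and the parameter choices in the usage lines are illustrative assumptions.

```python
import math
from functools import lru_cache

def bang_bang_values(p, N, f):
    """Return (optimal value, value of tau = 0, value of tau = N) for a
    Bernoulli(p) random walk with horizon N and reward f(M_N - X_tau)."""
    steps = ((1, p), (-1, 1.0 - p))

    @lru_cache(maxsize=None)
    def max_pmf(k):
        # Law of M_k, obtained by propagating the joint law of (X_j, M_j).
        joint = {(0, 0): 1.0}
        for _ in range(k):
            nxt = {}
            for (x, m), pr in joint.items():
                for s, ps in steps:
                    key = (x + s, max(m, x + s))
                    nxt[key] = nxt.get(key, 0.0) + pr * ps
            joint = nxt
        pmf = {}
        for (_, m), pr in joint.items():
            pmf[m] = pmf.get(m, 0.0) + pr
        return tuple(pmf.items())

    def G(k, z):
        # G(k, z) = E f(z v M_k): reward for stopping now with k steps left.
        return sum(pr * f(max(z, m)) for m, pr in max_pmf(k))

    @lru_cache(maxsize=None)
    def D(k, z):
        # D(k, z) = E f(z v M_k - X_k): reward for waiting until the horizon.
        if k == 0:
            return f(z)
        return sum(ps * D(k - 1, max(z - s, 0)) for s, ps in steps)

    @lru_cache(maxsize=None)
    def V(n, z):
        # Optimal expected reward from time n in state Z_n = z.
        stop = G(N - n, z)
        if n == N:
            return stop
        cont = sum(ps * V(n + 1, max(z - s, 0)) for s, ps in steps)
        return max(stop, cont)

    return V(0, 0), G(N, 0), D(N, 0)

if __name__ == "__main__":
    reward = lambda x: math.exp(-x)
    for p in (0.3, 0.5, 0.7):
        print(p, bang_bang_values(p, 8, reward))
```

For p above 1/2 the optimal value should agree with the value of waiting until N, and for p below 1/2 with the value of stopping at once, up to floating-point rounding.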

(ii) Suppose next that $\xi_1 \le_{st} -\xi_1$. Apply again the construction following
the statement of Lemma 2.4, but this time with $\xi_i \le \tilde\xi_i$ for all $i$. Observe
that all the other relationships between random variables and their tilded
counterparts are now reversed as well, i.e.

    $X_k \le \tilde X_k$,    $M_k \le \tilde M_k$,    $Z_k \ge \tilde Z_k$,

for $k = 0, 1, \dots, N$. Define the filtration $\{\mathcal{G}_k\}$ and the function $G(k, z)$ as
in the proof of part (i) above, and let

    $\tilde D(k, z) := E[f(z \vee \tilde M_k - \tilde X_k)]$.

In place of (2.12), we now have the inequality

    $E[f(z \vee \tilde M_k - \tilde X_k)] \ge E[f(z \vee M_k)]$,

or in other words, $\tilde D(k, z) \ge G(k, z)$. Furthermore, the fact that $f$ is
nonincreasing implies that $G(k, z)$ is nonincreasing in $z$, and therefore,

    $G(N - j, Z_j) \le G(N - j, \tilde Z_j)$

for each $j$. By (2.4), $E[f(M_N)] = E[f(\tilde Z_N)]$. Putting these facts together, we
obtain for any stopping time $\tau$ relative to $\{\mathcal{G}_k\}$, by the same kind of
reasoning as in the proof of part (i),

    $E[f(M_N - X_\tau)] = E[G(N - \tau, Z_\tau)] \le E[G(N - \tau, \tilde Z_\tau)]$
        $\le E[\tilde D(N - \tau, \tilde Z_\tau)] = E[f(\tilde Z_N)] = E[f(M_N)]$.

Hence, the rule $\tau \equiv 0$ is optimal.

(iii) Suppose finally that $\xi_1 \stackrel{d}{=} -\xi_1$. This is a special case of part
(i), so the rule $\tau \equiv N$ is optimal. Now let $\tau$ be any stopping time such that
with probability one, $X_\tau = M_\tau$ or $\tau = N$. Since $G(0, z) = f(z) = D(0, z)$ for
all $z \ge 0$ and $G(k, 0) = E[f(M_k)] = E[f(\tilde Z_k)] = E[f(Z_k)] = D(k, 0)$ for all
$k$, (2.14) and (2.15) give equality in (2.13). Hence, $\tau$ is optimal. □



3 The maximum of a Lévy process

A careful study of the proofs in the previous section reveals that the essential
property of the random walk is its independent and stationary increments.
Furthermore, in order to construct the random walk $\{X_n\}$ and its dual
$\{\tilde X_n\}$ on a common probability space in such a way that the increments of
$\{X_n\}$ uniformly dominate those of $\{\tilde X_n\}$ (or vice versa), the step-size
distribution had to satisfy a type of skew symmetry. With this in mind, we can
now extend the result to a much larger class of stochastic processes.

The general continuous-time analog of a random walk is a Lévy process, which
is defined as a stochastic process on $[0, \infty)$ with independent and stationary
increments which starts at 0 and is continuous in probability. Following standard
practice, we assume also that the process has almost surely right-continuous
sample paths with left-hand limits everywhere (or, for short, that the process is
rcll). If $X = (X_t)_{t \ge 0}$ is a (one-dimensional) Lévy process, it is uniquely
determined by the Lévy-Khintchine formula

    $E[e^{iuX_t}] = e^{t\eta(u)}$,

where

    $\eta(u) = i\gamma u - \frac{\sigma^2 u^2}{2} + \int_{\mathbb{R} \setminus \{0\}} [e^{iuy} - 1 - iuy\chi_{(-1,1)}(y)] \nu(dy)$.    (3.1)

In this expression, the Lévy measure $\nu$ satisfies
$\int_{\mathbb{R} \setminus \{0\}} (y^2 \wedge 1) \nu(dy) < \infty$, but $\nu$ need not be finite. We say that $X$ is
generated by the triplet $(\gamma, \sigma^2, \nu)$.
Define the supremum process $M = (M_t)_{t \ge 0}$ by

    $M_t := \sup_{0 \le s \le t} X_s$,    $t \ge 0$.

If $\nu$ is finite, then $X$ is simply the sum of a Brownian motion with drift and
a compound Poisson process, and it is straightforward to adapt the result of the
previous section. This is done in Subsection 3.1 below. If $\nu$ is not finite,
however, complications arise in attempting to couple the process $X$ with its
dual, and some additional conditions appear to be needed to overcome these
difficulties. This is made precise in Subsection 3.2. Finally, in Subsection 3.3
we eliminate one of the extra conditions in the case when $f$ is continuous and
bounded.

3.1 The case of finite $\nu$

We consider first the case when $\nu$ is finite. Then we may put

    $b := \gamma - \int_{0 < |y| < 1} y \nu(dy)$,

and express $X_t$ pathwise in the form

    $X_t = bt + \sigma B_t + \sum_{i=1}^{N(t)} \xi_i$,    (3.2)

where $B_t$ is a standard Brownian motion, $\xi_1, \xi_2, \dots$ are i.i.d. random variables
with distribution $\nu / |\nu|$, and $(N(t))_{t \ge 0}$ is a Poisson process with intensity
$|\nu|$. In this representation, the Poisson process, the Brownian motion and the
$\xi_i$'s are all independent of one another.
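The representation (3.2) translates directly into a simulation recipe. The following sketch samples such a process on a uniform time grid; the function name `simulate_finite_levy`, the grid size and the jump sampler are illustrative assumptions. Note that a supremum computed from grid values can only underestimate the true supremum $M_T$.

```python
import math
import random

def simulate_finite_levy(b, sigma, rate, jump_sampler, T, n_grid, rng):
    """Sample X_t = b t + sigma B_t + (sum of the first N(t) jumps) on the
    grid t_j = j T / n_grid, returning the list [X_{t_0}, ..., X_{t_n}].

    rate is the total mass |nu| of the Levy measure (the Poisson intensity),
    and jump_sampler(rng) draws one jump from the normalized law nu / |nu|.
    """
    # Jump times of the Poisson process via exponential interarrival times.
    jump_times = []
    t = 0.0
    if rate > 0:
        while True:
            t += rng.expovariate(rate)
            if t > T:
                break
            jump_times.append(t)
    jumps = [jump_sampler(rng) for _ in jump_times]

    dt = T / n_grid
    path = [0.0]
    brownian = 0.0
    jump_sum = 0.0
    idx = 0
    for j in range(1, n_grid + 1):
        t_j = j * dt
        brownian += rng.gauss(0.0, math.sqrt(dt))  # Brownian increment
        while idx < len(jump_times) and jump_times[idx] <= t_j:
            jump_sum += jumps[idx]                 # add jumps up to t_j
            idx += 1
        path.append(b * t_j + sigma * brownian + jump_sum)
    return path

if __name__ == "__main__":
    rng = random.Random(1)
    path = simulate_finite_levy(0.1, 1.0, 3.0, lambda r: r.gauss(0.5, 1.0),
                                T=1.0, n_grid=1000, rng=rng)
    print("X_T =", path[-1], " grid maximum =", max(path))
```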

Definition 3.1. Let $X = (X_t)_{t \ge 0}$ be a Lévy process of the form (3.2), with
finite Lévy measure $\nu$.

(i) $X$ is right skew symmetric (RSS) if $b \ge 0$ and
$\nu((a, \infty)) \ge \nu((-\infty, -a))$ for all $a > 0$.

(ii) $X$ is left skew symmetric (LSS) if $b \le 0$ and
$\nu((a, \infty)) \le \nu((-\infty, -a))$ for all $a > 0$.

(iii) $X$ is symmetric if $b = 0$ and $\nu((a, \infty)) = \nu((-\infty, -a))$ for all
$a > 0$.

Note that the condition regarding $\nu$ in the definition of RSS is equivalent to
$\xi_1 \ge_{st} -\xi_1$, because if the inequality holds for all $a > 0$, it holds for all
$a \in \mathbb{R}$. The following result is the analog of Theorem 2.1 for a Lévy process
with finite Lévy measure $\nu$.
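For a Lévy measure with finitely many atoms, the tail conditions of Definition 3.1 can be checked mechanically. The sketch below is an illustration under that assumption; the function name `classify` and the dictionary encoding of the measure are hypothetical. Both tails are right-continuous step functions of $a$ that change only at the absolute values of the atoms, so it suffices to compare them at $a = 0$ (which covers the interval below the smallest atom) and at each such absolute value.

```python
from fractions import Fraction

def classify(b, atoms):
    """Classify a process of the form (3.2) with drift b and a purely atomic
    finite Levy measure, given as a dict {jump size y != 0: mass nu({y})}.

    Returns "symmetric", "RSS", "LSS" or "neither" following Definition 3.1.
    """
    def right_tail(a):
        return sum(m for y, m in atoms.items() if y > a)    # nu((a, inf))

    def left_tail(a):
        return sum(m for y, m in atoms.items() if y < -a)   # nu((-inf, -a))

    # The tails only change at a in {|y|}; a = 0 covers (0, min |y|).
    checkpoints = [0] + sorted({abs(y) for y in atoms})
    diffs = [right_tail(a) - left_tail(a) for a in checkpoints]

    rss = b >= 0 and all(d >= 0 for d in diffs)
    lss = b <= 0 and all(d <= 0 for d in diffs)
    if rss and lss:
        return "symmetric"
    if rss:
        return "RSS"
    if lss:
        return "LSS"
    return "neither"

if __name__ == "__main__":
    print(classify(Fraction(1, 2), {1: 2, -1: 1}))
    # Jump law of Example 2.3: positive mean, yet neither RSS nor LSS.
    print(classify(0, {3: Fraction(1, 3), -1: Fraction(2, 3)}))
```

The second call uses the step law of Example 2.3 as a jump law, which fails both tail conditions and so falls outside the scope of the definition.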

Theorem 3.2. Let $X = (X_t)_{t \ge 0}$ be a Lévy process with finite Lévy measure $\nu$,
adapted to a filtration $\{\mathcal{F}_t\}$, such that $X_t - X_s$ is independent of
$\mathcal{F}_s$ for all $0 \le s \le t$. Assume $X$ is either RSS or LSS, and let $f$ be as in
Theorem 2.1. For fixed $T > 0$, consider the problem

    $\sup_{0 \le \tau \le T} E[f(M_T - X_\tau)]$,    (3.3)

where the supremum is over all stopping times $\tau$ relative to the filtration
$\{\mathcal{F}_t\}$ with $P(\tau \le T) = 1$.

(i) If $X$ is RSS, the rule $\tau \equiv T$ is optimal.

(ii) If $X$ is LSS, the rule $\tau \equiv 0$ is optimal.

(iii) If $X$ is symmetric, any rule $\tau$ satisfying
$P(X_\tau = M_\tau \text{ or } \tau = T) = 1$ is optimal.

If $\nu = 0$, then $X$ is a Brownian motion with drift. Thus, the above theorem
generalizes recent results of Shiryaev et al. [11], Du Toit and Peskir
[6, Section 4] and Allaart [1].

Definition 3.3. Let $X = (X_t)_{t \ge 0}$ be a Lévy process. The dual process of $X$,
denoted $\tilde X$, is a process such that $(\tilde X_t)_{t \ge 0} \stackrel{d}{=} (-X_t)_{t \ge 0}$. The
dual supremum process, denoted $\tilde M$, is the process defined by
$\tilde M_t := \sup_{0 \le s \le t} \tilde X_s$, for $t \ge 0$.

If $X$ is a Lévy process generated by the triplet $(\gamma, \sigma^2, \nu)$, then $\tilde X$ is a
Lévy process with triplet $(-\gamma, \sigma^2, \tilde\nu)$, where $\tilde\nu(A) = \nu(-A)$ for any Borel
set $A \subset \mathbb{R}$. Note that if $X$ is RSS, then $\tilde X$ is LSS and vice versa.

Lemma 3.4. Let $X$ be any Lévy process. Then, for each fixed $t > 0$,

    $(M_t - X_t, X_t) \stackrel{d}{=} (\tilde M_t, -\tilde X_t)$.

Proof. Define the time-reverse process

    $X'_s := X_{(t-s)-} - X_{t-}$ if $0 \le s < t$,    $X'_s := -X_{t-}$ if $s = t$.

Then $(X'_s : 0 \le s \le t)$ is a Lévy process with rcll sample paths, and (see
[10, Proposition 41.8]) $(X'_s : 0 \le s \le t) \stackrel{d}{=} (\tilde X_s : 0 \le s \le t)$. Since
$X_t$ is continuous in probability it follows that $P(X_{t-} = X_t) = 1$, and since
$X$ has almost surely rcll sample paths, it is straightforward to verify that
$\sup_{0 \le s \le t} X_s = \sup_{0 \le s \le t} X_{s-}$ with probability one. Thus, for all
$z, x \in \mathbb{R}$,

    $P(M_t - X_t \le z, X_t \le x) = P(\sup_{0 \le s \le t} (X_s - X_t) \le z, X_t \le x)$
        $= P(\sup_{0 \le s \le t} (X_{t-s} - X_t) \le z, X_t \le x)$
        $= P(\sup_{0 \le s < t} (X_{(t-s)-} - X_{t-}) \le z, X_{t-} \le x)$
        $= P(\sup_{0 \le s < t} X'_s \le z, -X'_t \le x)$
        $= P(M'_t \le z, -X'_t \le x)$,

where $M'_t := \sup_{0 \le s \le t} X'_s$ and, in the last step, we have used that
$P(X'_{t-} = X'_t) = 1$, and so $\sup_{0 \le s < t} X'_s = \sup_{0 \le s \le t} X'_s$ almost
surely. Since $X'$ has the same law as $\tilde X$, the lemma follows. □

Proof of Theorem 3.2. Assume for the moment that $X$ is RSS. Recall the
representation (3.2). On the same probability space on which the process $X$ is
defined, we construct the dual $\tilde X$ as follows. For each $i \in \mathbb{N}$, we can
construct out of $\xi_i$ (using an external randomization if necessary) a random
variable $\tilde\xi_i$ such that $\tilde\xi_i \stackrel{d}{=} -\xi_i$ and $\xi_i \ge \tilde\xi_i$ pointwise. Now put

    $\tilde X_t := -bt + \sigma B_t + \sum_{i=1}^{N(t)} \tilde\xi_i$,    $t \ge 0$.

Then it is easy to see that $(\tilde X_t)_{t \ge 0} \stackrel{d}{=} (-X_t)_{t \ge 0}$, and moreover, the
processes $X$ and $\tilde X$ satisfy the property that, for all $0 \le s \le t$ and for all
$\omega \in \Omega$,

    $X_t(\omega) - X_s(\omega) \ge \tilde X_t(\omega) - \tilde X_s(\omega)$.    (3.4)

For $t \ge 0$, define

    $Z_t := M_t - X_t$,    $\tilde Z_t := \tilde M_t - \tilde X_t$.

As in Section 2 it follows from (3.4) that

    $M_t \ge \tilde M_t$ and $Z_t \le \tilde Z_t$ for all $t \ge 0$.
Using these relationships and Lemma 3.4, we can show in exactly the same way
as in the proof of Lemma 2.4 that

    $E[f(z \vee M_t - X_t)] \ge E[f(z \vee (M_t - X_t))]$

for all $t \ge 0$ and all $z \ge 0$.

Next, for $t \ge 0$, let $\mathcal{G}_t$ be the smallest $\sigma$-algebra containing both
$\mathcal{F}_t$ and $\sigma(\{\tilde X_s : 0 \le s \le t\})$. Then $\{\mathcal{G}_t\}_{t \ge 0}$ is a filtration with
respect to which both $X$ and $\tilde X$ are adapted, and for each $0 \le s \le t$, both
$X_t - X_s$ and $\tilde X_t - \tilde X_s$ are independent of $\mathcal{G}_s$. The rest of the proof is
now the same (modulo subscript notation) as the proof of Theorem 2.1, where the
analogs of (2.14) and (2.15) follow since $X$, being a Lévy process, obeys the
strong Markov property. □

Question 3.5. It is clear that when $X$ is RSS, we have $X_t \ge_{st} \tilde X_t$ for all
$t \ge 0$. Does the converse of this statement hold?



3.2 The general case 

For a general Lévy process with nonfinite Lévy measure $\nu$, the construction of
the previous subsection is no longer possible because the jump times are dense in
the time interval $[0, T]$. Here we shall use the fact that a general Lévy process
on $[0, T]$ can always be obtained as the almost sure uniform limit of a sequence
of processes of the form (3.2). However, in order to ensure that this can be
done while preserving the uniform domination of increments (i.e. (3.4)), some
extra conditions appear to be needed. Let the Lévy-Khintchine representation of
$X = (X(t))_{t \ge 0}$ be given by (3.1).

Definition 3.6. We say $X$ is balanced in its small jumps (BSJ), if

    $L := \lim_{\varepsilon \downarrow 0} \int_{\varepsilon < |y| < 1} y \nu(dy)$ exists and is finite.    (3.5)

This condition is always satisfied when $\nu$ is symmetric on a sufficiently small
interval $(-\varepsilon, \varepsilon)$ where $\varepsilon > 0$, or when $\int_{0 < |y| < 1} |y| \nu(dy) < \infty$. (In the
latter case, the non-Gaussian part of $X$ has finite variation.) In the case when
$\int_{0 < |y| < 1} |y| \nu(dy) = \infty$, (3.5) may be interpreted as saying that $\nu$ is almost
symmetric in a sufficiently small neighborhood of the origin. Roughly speaking,
this means that we allow the small jumps of the process to be dense in time,
provided that the positive and negative jumps more or less balance each other.
It allows us to still think of the number $\gamma - L$ as the 'drift' of the process.

It is clear that if $X$ is BSJ, then so is its dual $\tilde X$.
Denote by $\tilde\nu$ the dual measure of $\nu$, so that $\tilde\nu(A) = \nu(-A)$ for
$A \subset \mathbb{R}$. If $\mu$ and $\nu$ are measures on $\mathbb{R}$ and $E \subset \mathbb{R}$, we say $\mu$
majorizes $\nu$ on $E$ if $\mu(F) \ge \nu(F)$ for every $F \subset E$.

Definition 3.7. Let $X = (X(t))_{t \ge 0}$ be a Lévy process.

(i) We say $X$ is strongly right skew symmetric (SRSS) if all of the following
hold:

(a) $X$ is balanced in its small jumps;

(b) $\gamma \ge L$, where $L$ is the limit in (3.5);

(c) $\nu((a, \infty)) \ge \nu((-\infty, -a))$ for all $a > 0$;

(d) There exists $\varepsilon > 0$ such that $\nu$ majorizes $\tilde\nu$ on $(0, \varepsilon)$.

(ii) We say $X$ is strongly left skew symmetric (SLSS) if $\tilde X$ is SRSS.

(iii) We say $X$ is symmetric if $\gamma = 0$ and $\nu = \tilde\nu$.

Remark 3.8. (a) If $X$ is symmetric, then it is both SRSS and SLSS, since (3.5)
holds with $L = 0$.

(b) If $X$ is SRSS (resp. SLSS) and $\nu$ is finite, then $X$ is RSS (resp. LSS),
since $b = \gamma - L$. The undesirable fourth condition in the definition of SRSS
seems to be needed in order to carry out the pathwise construction of $X$ and its
dual, below. At this point, the author does not see how to get around this
technical difficulty, except in the special case when $f$ is bounded and
continuous (see Subsection 3.3 below).

We can now state the result for the most general case. 

Theorem 3.9. Let $X = (X(t))_{t \ge 0}$ be a Lévy process, and let $f$ be as in Theorem
2.1. For fixed $T > 0$, consider the problem (3.3).

(i) If $X$ is SRSS, the rule $\tau \equiv T$ is optimal.

(ii) If $X$ is SLSS, the rule $\tau \equiv 0$ is optimal.

(iii) If $X$ is symmetric, any rule $\tau$ satisfying
$P(X(\tau) = M(\tau) \text{ or } \tau = T) = 1$ is optimal.

Example 3.10. (Stable processes) Suppose $X$ is a stable Lévy process with index
of stability $\alpha$ ($0 < \alpha \le 2$). If $\alpha = 2$, then $X$ is just a Brownian motion with
drift, and the optimal rule is already specified by Theorem 3.2. (In fact, in
this case the optimal rules are unique except for some trivial cases; see
Allaart [1].)
If $\alpha < 2$, then $\sigma = 0$ and the Lévy measure $\nu$ is of the form

    $\nu(dx) = \left[ \frac{c_1}{x^{1+\alpha}} \chi_{(0,\infty)}(x) + \frac{c_2}{|x|^{1+\alpha}} \chi_{(-\infty,0)}(x) \right] dx$,

where $c_1 \ge 0$, $c_2 \ge 0$, and $c_1 + c_2 > 0$ (see, e.g. Sato [10], p. 80). It follows
that if $1 \le \alpha < 2$, then $X$ is BSJ if and only if $c_1 = c_2$, in which case $\nu$ is
symmetric. In that case, $X$ is SRSS if $\gamma \ge 0$, and $X$ is SLSS if $\gamma \le 0$. On the
other hand, if $0 < \alpha < 1$, then the BSJ condition is always satisfied with

    $L = \int_{0 < |x| < 1} x \nu(dx) = \frac{c_1 - c_2}{1 - \alpha}$,

and $X$ is SRSS if $\gamma \ge L \ge 0$; or similarly, $X$ is SLSS if $\gamma \le L \le 0$.

Note that in the stable case, condition (d) in Definition 3.7 is satisfied
whenever (a)-(c) are.
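The dichotomy at $\alpha = 1$ in Example 3.10 can be read off from an explicit evaluation of the truncated integral in (3.5). The following short derivation uses only the form of the stable Lévy measure with constants $c_1$, $c_2$ and index $\alpha$; it is offered as a worked check rather than as part of the original text.

```latex
\begin{align*}
\int_{\varepsilon<|x|<1} x\,\nu(dx)
  &= c_1\int_{\varepsilon}^{1} x^{-\alpha}\,dx \;-\; c_2\int_{\varepsilon}^{1} x^{-\alpha}\,dx \\
  &= \begin{cases}
       (c_1-c_2)\,\dfrac{1-\varepsilon^{1-\alpha}}{1-\alpha}, & \alpha \neq 1,\\[1.5ex]
       (c_1-c_2)\,\log(1/\varepsilon), & \alpha = 1.
     \end{cases}
\end{align*}
```

As $\varepsilon \downarrow 0$, the first expression tends to $(c_1 - c_2)/(1 - \alpha)$ when $0 < \alpha < 1$, whereas for $1 \le \alpha < 2$ both expressions diverge unless $c_1 = c_2$, in which case the integral vanishes for every $\varepsilon$.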

The proof of Theorem 3.9 hinges on the following construction. Once this is
accomplished, the rest of the proof is the same as before.

Lemma 3.11. Let $X$ be a SRSS Lévy process. Then, on a suitable probability
space $(\Omega, \mathcal{F}, P)$, we can construct $X$ and its dual $\tilde X$ in such a way that
there exists a set $\Omega_0 \subset \Omega$ with $P(\Omega_0) = 1$ such that, for all $0 \le s \le t$ and
for all $\omega \in \Omega_0$,

    $X(t; \omega) - X(s; \omega) \ge \tilde X(t; \omega) - \tilde X(s; \omega)$.    (3.6)

Proof. Let $\varepsilon > 0$ be as in the definition of SRSS. Then $X(t)$ can be expressed by
the Lévy-Itô decomposition

    $X(t) = \gamma' t + \sigma B(t) + \int_{|y| < \varepsilon} y N'(t, dy) + \int_{|y| \ge \varepsilon} y N(t, dy)$,

where $B(t)$ is a standard Brownian motion on $\mathbb{R}$,
$\gamma' := \gamma - \int_{\varepsilon \le |y| < 1} y \nu(dy)$, $(N(t, \cdot))_{t \ge 0}$ is a Poisson random measure
with intensity measure $\nu$ which is independent of the Brownian motion, and
$N'(t, \cdot)$ is defined by

    $N'(t, dy) = N(t, dy) - t\nu(dy)$,    $t \ge 0$.

In general, the integrals $\int_{|y| < \varepsilon} y N(t, dy)$ and $\int_{|y| < \varepsilon} y \nu(dy)$ need not
converge, but the 'compensated sum of small jumps', $\int_{|y| < \varepsilon} y N'(t, dy)$,
always does.

Now we will construct a sequence of Lévy processes $Y_1, Y_2, \dots$ and their duals
$\tilde Y_1, \tilde Y_2, \dots$, as follows. Let $\varepsilon = \varepsilon_1 > \varepsilon_2 > \cdots$ be a sequence of
numbers decreasing to zero. Define first

    $Y_1(t) := (\gamma - L)t + \sigma B(t) + \int_{|y| \ge \varepsilon} y N(t, dy)$.

Then $Y_1$ has finite Lévy measure $\nu_1$, where $\nu_1$ is the restriction of $\nu$ to the
set $\{y : |y| \ge \varepsilon\}$. Clearly $\nu_1((a, \infty)) \ge \nu_1((-\infty, -a))$ for all $a > 0$, since
$\nu_1$ simply inherits this property from $\nu$. Since $\gamma \ge L$, we can construct $Y_1$
and its dual $\tilde Y_1$ on the same probability space so that these processes
satisfy the increment property (3.4). Next, for $n \ge 2$, let

    $Y_n(t) = \int_{\varepsilon_n \le |y| < \varepsilon_{n-1}} y N(t, dy)$,

and note that by the usual independence property of Poisson point processes, the
processes $Y_n$, $n \in \mathbb{N}$ may be constructed independently of each other. Now for
each $n \ge 2$, $Y_n$ is a compound Poisson process with (finite) Lévy measure
$\nu_n$, where $\nu_n$ is the restriction of $\nu$ to the set
$\{y : \varepsilon_n \le |y| < \varepsilon_{n-1}\}$. Since $\nu$ majorizes $\tilde\nu$ on $(0, \varepsilon)$, it follows that
$\nu_n((a, \infty)) \ge \tilde\nu_n((a, \infty))$ for all $a > 0$ and all $n \ge 2$. (Note that this
fact would not be guaranteed without the fourth condition in the definition of
SRSS.) Thus, we can construct $Y_n$ and its dual $\tilde Y_n$ together as in the
previous subsection in such a way that these processes satisfy (3.4).
Finally, put

    $X_n(t) := Y_1(t) + \cdots + Y_n(t)$,    $\tilde X_n(t) := \tilde Y_1(t) + \cdots + \tilde Y_n(t)$

for $n \in \mathbb{N}$, so that $\tilde X_n$ is the dual of $X_n$. Since the property (3.4) is
clearly preserved under addition of two or more processes, we have that, for all
$0 \le s \le t$,

    $X_n(t) - X_n(s) \ge \tilde X_n(t) - \tilde X_n(s)$    (3.7)

pointwise on $\Omega$. Finally, note that $X_n(t)$ can be written as

    $X_n(t) = (\gamma - L)t + \sigma B(t) + \int_{|y| \ge \varepsilon_n} y N(t, dy)$
        $= \gamma_n t + \sigma B(t) + \int_{\varepsilon_n \le |y| < \varepsilon} y N'(t, dy) + \int_{|y| \ge \varepsilon} y N(t, dy)$,

where

    $\gamma_n := \gamma - L + \int_{\varepsilon_n \le |y| < \varepsilon} y \nu(dy)$.

By (3.5), $\gamma_n \to \gamma'$, and it follows from Theorem 2.6.2 in [2] that
$X_n(t) \to X(t)$ uniformly in $[0, T]$ with probability one, as long as the sequence
$\{\varepsilon_n\}$ decreases fast enough so that

    $\int_{0 < |y| < \varepsilon_n} y^2 \nu(dy) \le \frac{1}{8^n}$    (3.8)

for every $n$. Similarly, $\tilde X_n(t) \to \tilde X(t)$ uniformly in $[0, T]$ with probability
one. And, by taking limits in (3.7), we see that $X$ and $\tilde X$ satisfy (3.6)
everywhere on the set on which both processes converge. □
3.3 The case of bounded and continuous $f$

In general, it seems difficult to eliminate the unnatural condition (d) in the
definition of SRSS, except when the reward function $f$ is bounded and continuous
on $[0, \infty)$. This case includes, for instance, the natural reward function
$f(x) = e^{-ax}$ with $a > 0$.

Say a general Lévy process $X = (X(t))_{t \ge 0}$ with Lévy-Khintchine
representation (3.1) is right skew symmetric (RSS) if

    $\gamma \ge \liminf_{\delta \downarrow 0} \int_{\delta < |y| < 1} y \nu(dy)$,    (3.9)

and $\nu((a, \infty)) \ge \nu((-\infty, -a))$ for all $a > 0$. Say $X$ is left skew symmetric
(LSS) if $\tilde X$ is right skew symmetric.

Theorem 3.12. Let $X = (X(t))_{t \ge 0}$ be a Lévy process, and let
$f : [0, \infty) \to \mathbb{R}$ be bounded, nonincreasing, continuous and convex. For fixed
$T > 0$, consider the problem (3.3).

(i) If $X$ is RSS, the rule $\tau \equiv T$ is optimal.

(ii) If $X$ is LSS, the rule $\tau \equiv 0$ is optimal.

(Observe that the symmetric case is already covered by Theorem 3.9.)

Proof. Suppose first that $X$ is RSS. Let
$L := \liminf_{\delta \downarrow 0} \int_{\delta < |y| < 1} y \nu(dy)$, and choose a sequence
$\delta_1 > \delta_2 > \cdots > 0$ so that $\lim_{k \to \infty} \int_{\delta_k < |y| < 1} y \nu(dy) = L$. For each
$n$, choose $k_n$ so that $\varepsilon_n := \delta_{k_n}$ satisfies (3.8). Now we construct the
process $X$ as an almost-sure uniform limit of a sequence of processes
$X_n = (X_n(t))_{t \ge 0}$, $n \in \mathbb{N}$, exactly as in the proof of Lemma 3.11. Then each
$X_n$ is RSS in the sense of Subsection 3.1. (Note that in order to construct the
processes $X_n$ in this way, without their duals, condition (d) in Definition 3.7
is not needed.) For each $t \ge 0$, let $\mathcal{F}_t$ be the smallest $\sigma$-algebra
containing each $\sigma(\{X_n(s) : 0 \le s \le t\})$, $n \in \mathbb{N}$. Let $\Omega_0$ be the subset of
$\Omega$ on which $X_n(t)$ converges uniformly in $t$. By arbitrarily redefining
$X(t; \omega) = 0$ for $\omega \in \Omega \setminus \Omega_0$, we see that $X$ is adapted to $\{\mathcal{F}_t\}$, and
clearly $X_t - X_s$ is independent of $\mathcal{F}_s$ for each $0 \le s \le t$. Thus, by
Theorem 3.2, for any stopping time $\tau$ relative to $\{\mathcal{F}_t\}$,

    $E[f(M_n(T) - X_n(\tau))] \le E[f(M_n(T) - X_n(T))]$.    (3.10)

Now it follows from the uniform convergence of $X_n$ to $X$ that, pointwise on
$\Omega_0$, $M_n(T) \to M(T)$ and $X_n(\tau) \to X(\tau)$, and hence, by the continuity of $f$,
$f(M_n(T) - X_n(\tau)) \to f(M(T) - X(\tau))$ and
$f(M_n(T) - X_n(T)) \to f(M(T) - X(T))$. Thus, taking limits in (3.10) we see via
the Bounded Convergence Theorem that

    $E[f(M(T) - X(\tau))] \le E[f(M(T) - X(T))]$.

Therefore, the rule $\tau \equiv T$ is optimal. A similar argument shows that the rule
$\tau \equiv 0$ is optimal if $X$ is LSS. □

Remark 3.13. If we try to extend the above reasoning to unbounded continuous
$f$ via the Dominated Convergence Theorem, we run into the difficulty of bounding
expectations such as $E|f(M_n(T))|$ uniformly in $n$, since there is no guarantee
that $E|f(M_n(T))|$ converges to $E|f(M(T))|$.

Remark 3.14. It may seem that in Theorem 3.9 we could have weakened the
SRSS condition similarly, replacing (a) and (b) in Definition 3.7 with (3.9). But
this would not actually give a weaker hypothesis, since in the presence of
condition (d), the integral in (3.5) increases monotonically as $\varepsilon \downarrow 0$.